Remove HMP from optimum-habana #349

jwieczorekhabana · 2023-08-18T13:28:50Z

HMP is deprecated in favor of PyTorch autocast

removed hmp usage
removed setting autocast env variables through GaudiConfig
updated tests
updated docs
updated README.md files

regisss · 2023-08-18T14:05:40Z

@jwieczorekhabana Thanks for opening this PR! I'll review it in the beginning of next week and I'll try to accelerate on merging the autocast PRs.

regisss

@jwieczorekhabana I left several comments, the main point being that #308 enables users to specify their own op lists in Gaudi config files so this should be documented, tested, etc (which should basically consist in changing a few variable names).

For OPT, I need to check if we need to disable autocast for this specific line.

Note that we should not merge this PR before we manage to make Wav2Vec2 work with autocast. It is being investigated but no fix has been merged yet.

regisss · 2023-09-12T16:53:40Z

optimum/habana/diffusers/pipelines/pipeline_utils.py

-                elif self.gaudi_config.use_torch_autocast:
-                    # Open temporary files to write mixed-precision ops
-                    with tempfile.NamedTemporaryFile() as hmp_bf16_file:
-                        with tempfile.NamedTemporaryFile() as hmp_fp32_file:
-                            self.gaudi_config.write_bf16_fp32_ops_to_text_files(
-                                hmp_bf16_file.name,
-                                hmp_fp32_file.name,
-                            )
-                            os.environ["LOWER_LIST"] = str(hmp_bf16_file)
-                            os.environ["FP32_LIST"] = str(hmp_fp32_file)


We should keep this block as it enables to specify custom bf16/fp32 ops for autocast

optimum/habana/transformers/modeling_utils.py

regisss · 2023-09-12T16:58:28Z

optimum/habana/transformers/models/gpt2/modeling_gpt2.py

Actually we still need this for GPT2 with autocast, see #308 that I just merged.
Can you rebase your branch and replace the "HMP" mention in the docstring please?

regisss · 2023-09-12T17:01:10Z

optimum/habana/transformers/gaudi_configuration.py

-    def write_bf16_fp32_ops_to_text_files(
-        self,
-        path_to_bf16_file: Path,
-        path_to_fp32_file: Path,
-    ):
-        for path, ops in zip(
-            [Path(path_to_bf16_file), Path(path_to_fp32_file)], [self.hmp_bf16_ops, self.hmp_fp32_ops]
-        ):
-            with path.open("w") as text_file:
-                # writelines does not add new lines after each element so "\n" is inserted
-                text_file.writelines(op + "\n" for op in ops)


We still need this method to be able to specify autocast custom op lists in a Gaudi config

regisss · 2023-09-12T17:10:00Z

tests/test_gaudi_configuration.py

-    def test_write_bf16_fp32_ops_to_text_files(self):
-        gaudi_config = GaudiConfig()
-
-        with tempfile.NamedTemporaryFile() as bf16_file:
-            with tempfile.NamedTemporaryFile() as fp32_file:
-                gaudi_config.write_bf16_fp32_ops_to_text_files(
-                    bf16_file.name,
-                    fp32_file.name,
-                )
-
-                self.assertTrue(
-                    filecmp.cmp(
-                        bf16_file.name,
-                        BF16_OPS_REFERENCE_FILE,
-                        shallow=False,
-                    )
-                )
-                self.assertTrue(
-                    filecmp.cmp(
-                        fp32_file.name,
-                        FP32_OPS_REFERENCE_FILE,
-                        shallow=False,
-                    )
-                )


We should keep this test as write_bf16_fp32_ops_to_text_files will still be used to specify custom op lists.
I propose to instantiate a Gaudi config such as:

gaudi_config = GaudiConfig( autocast_bf16_ops=[ "add", "addmm", "bmm", "div", "dropout", "gelu", "iadd", "linear", "layer_norm", "matmul", "mm", "rsub", "softmax", "truediv", ], autocast_fp32_ops=[ "embedding", "nll_loss", "log_softmax", ], )

regisss · 2023-09-12T17:15:22Z

docs/source/package_reference/gaudi_config.mdx

- `hmp_bf16_ops` enables to specify the Torch operations that should be computed in *bf16*. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).
- `hmp_fp32_ops` enables to specify the Torch operations that should be computed in *fp32*. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).


I think we should mention:

use_torch_autocast but saying that --bf16 should be favored as use_torch_autocast is used to define a good pre-defined config

autocast_bf16_ops and autocast_fp32_ops as Add support for autocast custom ops in GaudiTrainer #308 enables users to specify cutom op lists but saying that the default should work for most models

As discussed by email, regarding autocast_bf16_ops and autocast_fp32_ops, I'm fine with saying that the env variable way should be favored. But they should still be documented.

regisss · 2023-09-12T17:16:32Z

docs/source/usage_guides/accelerate_training.mdx

@@ -57,44 +57,16 @@ To not take them into account in the computation of the throughput at the end of
 ## Mixed-Precision Training

 Mixed-precision training enables to compute some operations using lighter data types to accelerate training.
-Habana Mixed Precision (HMP) proposes to mix *fp32* and *bf16* operations.
+Optimum-Habana enables mixed precision training in a similar fasion as HuggingFace transofrmers.


Suggested change

Optimum-Habana enables mixed precision training in a similar fasion as HuggingFace transofrmers.

Optimum Habana enables mixed precision training in a similar fashion as 🤗 Transformers.

regisss · 2023-09-12T17:17:38Z

docs/source/usage_guides/accelerate_training.mdx

-<Tip warning={true}>
-
-Please refer to the [list of supported PyTorch operators](https://docs.habana.ai/en/latest/PyTorch/Pytorch_Operators/Pytorch_Operators.html) beforehand to make sure the ones you are interested in are compatible with *bf16*.

-</Tip>


I would keep this

But those operators are incompatible with autocast. HMP and autocast operate on different software levels. Please see: https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/Autocast.html#override-options

No problem, we don't have to keep the same operators. Maybe it will just be easier to refer to GPT2's Gaudi config.

docs/source/usage_guides/accelerate_training.mdx

regisss · 2023-09-12T17:20:02Z

docs/source/usage_guides/accelerate_training.mdx

-Then, you can specify which operators to compute in *bf16* with `"hmp_bf16_ops"` and which operators to compute in *fp32* with `"hmp_fp32_ops"`.
-If these operators are not specified, their default values are set to be the ones written in the [Gaudi configuration file of BERT](https://huggingface.co/Habana/bert-large-uncased-whole-word-masking/blob/main/gaudi_config.json), which is a good starting point for applying HMP:
-```
-"hmp_bf16_ops": [
-    "add",
-    "addmm",
-    "bmm",
-    "div",
-    "dropout",
-    "gelu",
-    "iadd",
-    "linear",
-    "layer_norm",
-    "matmul",
-    "mm",
-    "rsub",
-    "softmax",
-    "truediv"
-],
-"hmp_fp32_ops": [
-    "embedding",
-    "nll_loss",
-    "log_softmax"
-]
-```


I would still keep a part of this to show how to specify custom op lists. We can add a link to the GPT2 Gaudi config when it is updated.

But shouldn't users provide custom lists in a similar way to other training demos outside of HuggingFace? We can keep those in GaudiConfig to make sure they are optimized for specific model.

IMO users should be able to do both because those already used to Optimum Habana probably have Gaudi configs with custom op lists, so switching to Autocast will be easy and they won't be confused.

HuggingFaceDocBuilderDev · 2023-09-15T13:26:05Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

regisss

Thanks for updating the PR! I left a few comments.

Could you also run the following to make the code style check pass please?

pip install black ruff
make style

docs/source/package_reference/gaudi_config.mdx

docs/source/usage_guides/accelerate_training.mdx

tests/test_gaudi_configuration.py

optimum/habana/transformers/models/gpt2/modeling_gpt2.py

optimum/habana/diffusers/pipelines/pipeline_utils.py

regisss

LGTM!

Let's wait to know what happens with Wav2Vec2 training before merging anything cc @libinta

docs/source/usage_guides/accelerate_training.mdx

regisss · 2023-11-15T13:40:29Z

@jwieczorekhabana will the Gaudi config for Wav2Vec2 be the following?

{
  "use_fused_adam": true,
  "use_fused_clip_norm": true,
  "use_torch_autocast": true
}

Or will there be any specific bf16 ops?

jwieczorekhabana · 2023-11-16T06:46:51Z

@jwieczorekhabana will the Gaudi config for Wav2Vec2 be the following?
{
  "use_fused_adam": true,
  "use_fused_clip_norm": true,
  "use_torch_autocast": true
}
Or will there be any specific bf16 ops?

That'll be it. There is a hanging PR on HuggingFace https://huggingface.co/Habana/wav2vec2/discussions/2/files that can be merged after 1.13 release.
Similarly for gpt2 there is a closed PR that can be reopened https://huggingface.co/Habana/gpt2/discussions/4/files

Release 1.13 addresses issues both of those topologies had.

regisss · 2023-11-16T08:17:50Z

Cool, I'll merge them when 1.13 is released then 👍

regisss · 2023-11-19T11:56:04Z

@jwieczorekhabana Can you rebase this branch on main? There are merge conflicts in tests/test_diffusers.py so GitHub won't let me merge the PR when 1.13 is released.

HMP is deprecated in favor of PyTorch autocast - removed hmp usage - removed setting autocast env variables through GaudiConfig - updated tests - updated docs - updated README.md files

- restored test_gaudi_config imports

Co-authored-by: regisss <[email protected]>

jwieczorekhabana · 2023-11-22T07:58:12Z

@jwieczorekhabana Can you rebase this branch on main? There are merge conflicts in tests/test_diffusers.py so GitHub won't let me merge the PR when 1.13 is released.

@regisss Done

regisss · 2023-11-22T16:54:00Z

@jwieczorekhabana It seems there are still some merge conflicts in tests/test_diffusers.py

Change-Id: Ibbd216b9b6b5c9104b7dce3021f231eda3748704

regisss · 2023-11-24T13:20:54Z

@jwieczorekhabana I pushed a few commits to solve the CI that was not passing

regisss reviewed Sep 12, 2023

View reviewed changes

regisss mentioned this pull request Sep 13, 2023

Add support for autocast custom ops in GaudiTrainer #308

Merged

3 tasks

jwieczorekhabana force-pushed the dev/jwieczorekhabana/autocast branch from cba7f71 to 184acc3 Compare September 15, 2023 09:56

jwieczorekhabana requested review from ZhaiFeiyue, skaulintel and ANSHUMAN87 as code owners September 15, 2023 09:56

regisss added the run-test Run CI for PRs from external contributors label Sep 15, 2023

regisss reviewed Sep 15, 2023

View reviewed changes

regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Sep 19, 2023

regisss approved these changes Sep 19, 2023

View reviewed changes

docs/source/usage_guides/accelerate_training.mdx Outdated Show resolved Hide resolved

regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Sep 19, 2023

jwieczorekhabana force-pushed the dev/jwieczorekhabana/autocast branch from d668dfb to fdb4d92 Compare November 6, 2023 11:41

regisss mentioned this pull request Nov 10, 2023

remove hmp #518

Closed

3 tasks

regisss added the synapse 1.13 label Nov 14, 2023

jwieczorekhabana and others added 4 commits November 22, 2023 09:56

Remove HMP from optimum-habana

b8b9645

HMP is deprecated in favor of PyTorch autocast - removed hmp usage - removed setting autocast env variables through GaudiConfig - updated tests - updated docs - updated README.md files

Revome HMP - minor fixes and docs edit

a4a8ebf

Formatting code

a3d4011

- restored test_gaudi_config imports

Update docs/source/usage_guides/accelerate_training.mdx

cc84d33

Co-authored-by: regisss <[email protected]>

jwieczorekhabana force-pushed the dev/jwieczorekhabana/autocast branch from 1f164dd to cc84d33 Compare November 22, 2023 07:57

regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Nov 22, 2023

Make style and merge fixes

676a348

Change-Id: Ibbd216b9b6b5c9104b7dce3021f231eda3748704

regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Nov 24, 2023

regisss added 4 commits November 24, 2023 10:04

Update gaudi_configuration.py

23d7bb1

Replace last occurrences of hmp_* by autocast_*

55aa0cd

Remove occurrences of use_habana_mixed_precision

690201e

Fix

d817217

regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Nov 24, 2023

regisss approved these changes Nov 24, 2023

View reviewed changes

regisss merged commit 2129f91 into huggingface:main Nov 24, 2023
11 of 12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove HMP from optimum-habana #349

Remove HMP from optimum-habana #349

jwieczorekhabana commented Aug 18, 2023

regisss commented Aug 18, 2023

regisss left a comment

regisss Sep 12, 2023

regisss Sep 12, 2023

regisss Sep 12, 2023

regisss Sep 12, 2023

regisss Sep 12, 2023

regisss Sep 15, 2023

regisss Sep 12, 2023

regisss Sep 12, 2023

jwieczorekhabana Sep 15, 2023

regisss Sep 15, 2023

regisss Sep 12, 2023

jwieczorekhabana Sep 15, 2023

regisss Sep 15, 2023

HuggingFaceDocBuilderDev commented Sep 15, 2023

regisss left a comment

regisss left a comment

regisss commented Nov 15, 2023

jwieczorekhabana commented Nov 16, 2023

regisss commented Nov 16, 2023

regisss commented Nov 19, 2023

jwieczorekhabana commented Nov 22, 2023

regisss commented Nov 22, 2023

regisss commented Nov 24, 2023

		- `hmp_bf16_ops` enables to specify the Torch operations that should be computed in bf16. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).
		- `hmp_fp32_ops` enables to specify the Torch operations that should be computed in fp32. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).

	Optimum-Habana enables mixed precision training in a similar fasion as HuggingFace transofrmers.
	Optimum Habana enables mixed precision training in a similar fashion as 🤗 Transformers.

Remove HMP from optimum-habana #349

Remove HMP from optimum-habana #349

Conversation

jwieczorekhabana commented Aug 18, 2023

regisss commented Aug 18, 2023

regisss left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

HuggingFaceDocBuilderDev commented Sep 15, 2023

regisss left a comment

Choose a reason for hiding this comment

regisss left a comment

Choose a reason for hiding this comment

regisss commented Nov 15, 2023

jwieczorekhabana commented Nov 16, 2023

regisss commented Nov 16, 2023

regisss commented Nov 19, 2023

jwieczorekhabana commented Nov 22, 2023

regisss commented Nov 22, 2023

regisss commented Nov 24, 2023